RAW SEQUENCE LISTING

PATENT APPLICATION US/08/918,288



DATE: 06/09/2000 
TIME: 05:01:51 



INPUT SET: S35603.raw



This Raw Listing contains the General 
Information Section and up to the first 5 pages* 
SEQUENCE LISTING

(1) GENERAL INFORMATION:



(i) APPLICANT: BOIME, Irving 

MOYLE, William R. 






(ii) TITLE OF THE INVENTION: SINGLE-CHAIN FORMS OF THE GLYCOPROTEIN HORMONE QUARTET

(iii) NUMBER OF SEQUENCES: 83 

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: MORRISON & FOERSTER
(B) STREET: 2000 Pennsylvania Avenue, NW, Suite 5500
(C) CITY: Washington
(D) STATE: DC
(E) COUNTRY: USA
(F) ZIP: 20006-1888

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: 08/918,288
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 09/282,357
(B) FILING DATE:

(A) APPLICATION NUMBER: 08/853,524
(B) FILING DATE: 09-MAY-1997

(A) APPLICATION NUMBER: 08/199,382
(B) FILING DATE: 18-FEB-1994


(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Murashige, Kate H 

(B) REGISTRATION NUMBER: 29,959 

(C) REFERENCE/DOCKET NUMBER: 29500-20050.25 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 202-887-1500 

(B) TELEFAX: 202-887-0763

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Ser Ser Ser Ser Lys Ala Pro Pro Pro Ser Leu Pro Ser Pro Ser Arg
1               5                   10                  15

Leu Pro Gly Pro Ser Asp Thr Pro Ile Leu Pro Gln
            20                  25

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 836 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 33...827
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

ATGAAATCGA CGGAATCAGA CTCGAGCCAA GG ATG GAG ATG TTC CAG GGG CTG  53
                                    Met Glu Met Phe Gln Gly Leu
                                    1               5

CTG CTG TTG CTG CTG CTG AGC ATG GGC GGG ACA TGG GCA TCC AAG GAG  101
Leu Leu Leu Leu Leu Leu Ser Met Gly Gly Thr Trp Ala Ser Lys Glu
        10                  15                  20

CCG CTT CGG CCA CGG TGC CGC CCC ATC AAT GCC ACC CTG GCT GTG GAG  149
Pro Leu Arg Pro Arg Cys Arg Pro Ile Asn Ala Thr Leu Ala Val Glu
    25                  30                  35

AAG GAG GGC TGC CCC GTG TGC ATC ACC GTC AAC ACC ACC ATC TGT GCC  197
Lys Glu Gly Cys Pro Val Cys Ile Thr Val Asn Thr Thr Ile Cys Ala
40                  45                  50                  55

GGC TAC TGC CCC ACC ATG ACC CGC GTG CTG CAG GGG GTC CTG CCG GCC  245
Gly Tyr Cys Pro Thr Met Thr Arg Val Leu Gln Gly Val Leu Pro Ala
                60                  65                  70

CTG CCT CAG GTG GTG TGC AAC TAC CGC GAT GTG CGC TTC GAG TCC ATC  293
Leu Pro Gln Val Val Cys Asn Tyr Arg Asp Val Arg Phe Glu Ser Ile
            75                  80                  85

CGG CTC CCT GGC TGC CCG CGC GGC GTG AAC CCC GTG GTC TCC TAC GCC  341
Arg Leu Pro Gly Cys Pro Arg Gly Val Asn Pro Val Val Ser Tyr Ala
        90                  95                  100

GTG GCT CTC AGC TGT CAA TGT GCA CTC TGC CGC CGC AGC ACC ACT GAC  389
Val Ala Leu Ser Cys Gln Cys Ala Leu Cys Arg Arg Ser Thr Thr Asp
    105                 110                 115

TGC GGG GGT CCC AAG GAC CAC CCC TTG ACC TGT GAT GAC CCC CGC TTC  437
Cys Gly Gly Pro Lys Asp His Pro Leu Thr Cys Asp Asp Pro Arg Phe
120                 125                 130                 135

CAG GAC TCC TCT TCC TCA AAG GCC CCT CCC CCC AGC CTT CCA AGC CCA  485
Gln Asp Ser Ser Ser Ser Lys Ala Pro Pro Pro Ser Leu Pro Ser Pro
                140                 145                 150

TCC CGA CTC CCG GGG CCC TCG GAC ACC CCG ATC CTC CCC CAA GGA TCC  533
Ser Arg Leu Pro Gly Pro Ser Asp Thr Pro Ile Leu Pro Gln Gly Ser
            155                 160                 165

GGT AGC GGA TCT GGT AGC GCT CCT GAT GTG CAG GAT TGC CCA GAA TGC  581
Gly Ser Gly Ser Gly Ser Ala Pro Asp Val Gln Asp Cys Pro Glu Cys
        170                 175                 180

ACG CTA CAG GAA AAC CCA TTC TTC TCC CAG CCG GGT GCC CCA ATA CTT  629
Thr Leu Gln Glu Asn Pro Phe Phe Ser Gln Pro Gly Ala Pro Ile Leu
    185                 190                 195

CAG TGC ATG GGC TGC TGC TTC TCT AGA GCA TAT CCC ACT CCA CTA AGG  677
Gln Cys Met Gly Cys Cys Phe Ser Arg Ala Tyr Pro Thr Pro Leu Arg
200                 205                 210                 215

TCC AAG AAG ACG ATG TTG GTC CAA AAG AAC GTC ACC TCA GAG TCC ACT  725
Ser Lys Lys Thr Met Leu Val Gln Lys Asn Val Thr Ser Glu Ser Thr
                220                 225                 230

TGC TGT GTA GCT AAA TCA TAT AAC AGG GTC ACA GTA ATG GGG GGT TTC  773
Cys Cys Val Ala Lys Ser Tyr Asn Arg Val Thr Val Met Gly Gly Phe
            235                 240                 245

AAA GTG GAG AAC CAC ACG GCG TGC CAC TGC AGT ACT TGT TAT TAT CAC  821
Lys Val Glu Asn His Thr Ala Cys His Cys Ser Thr Cys Tyr Tyr His
        250                 255                 260

AAA TCT TAAGGTACC  836
Lys Ser
    265



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 265 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:



Met Glu Met Phe Gln Gly Leu Leu Leu Leu Leu Leu Leu Ser Met Gly
1               5                   10                  15

Gly Thr Trp Ala Ser Lys Glu Pro Leu Arg Pro Arg Cys Arg Pro Ile
            20                  25                  30

Asn Ala Thr Leu Ala Val Glu Lys Glu Gly Cys Pro Val Cys Ile Thr
        35                  40                  45

Val Asn Thr Thr Ile Cys Ala Gly Tyr Cys Pro Thr Met Thr Arg Val
    50                  55                  60

Leu Gln Gly Val Leu Pro Ala Leu Pro Gln Val Val Cys Asn Tyr Arg
65                  70                  75                  80

Asp Val Arg Phe Glu Ser Ile Arg Leu Pro Gly Cys Pro Arg Gly Val
                85                  90                  95

Asn Pro Val Val Ser Tyr Ala Val Ala Leu Ser Cys Gln Cys Ala Leu
            100                 105                 110

Cys Arg Arg Ser Thr Thr Asp Cys Gly Gly Pro Lys Asp His Pro Leu
        115                 120                 125

Thr Cys Asp Asp Pro Arg Phe Gln Asp Ser Ser Ser Ser Lys Ala Pro
    130                 135                 140

Pro Pro Ser Leu Pro Ser Pro Ser Arg Leu Pro Gly Pro Ser Asp Thr
145                 150                 155                 160

Pro Ile Leu Pro Gln Gly Ser Gly Ser Gly Ser Gly Ser Ala Pro Asp
                165                 170                 175

Val Gln Asp Cys Pro Glu Cys Thr Leu Gln Glu Asn Pro Phe Phe Ser
            180                 185                 190

Gln Pro Gly Ala Pro Ile Leu Gln Cys Met Gly Cys Cys Phe Ser Arg
        195                 200                 205

Ala Tyr Pro Thr Pro Leu Arg Ser Lys Lys Thr Met Leu Val Gln Lys
    210                 215                 220

Asn Val Thr Ser Glu Ser Thr Cys Cys Val Ala Lys Ser Tyr Asn Arg
225                 230                 235                 240

Val Thr Val Met Gly Gly Phe Lys Val Glu Asn His Thr Ala Cys His
                245                 250                 255

Cys Ser Thr Cys Tyr Tyr His Lys Ser

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 834 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

TCCGGATTAG CTTGAGATGG ATCCGGTACC TTAAGATTTG TGATAATAAC AAGTACTGCA  60

GTGGCACGCC GTGTGGTTCT CCACTTTGAA ACCCCCCATT ACTGTGACCC TGTTATATGA  120

TTTAGCTACA CAGCAAGTGG ACTCTGAGGT GACGTTCTTT TGGACCAACA TCGTCTTCTT  180

GGACCTTAGT GGAGTGGGAT ATGCTCTAGA GAAGCAGCAG CCCATGCACT GAAGTATTGG  240

GGCACCCGGC TGGGAGAAGA ATGGGTTTTC CTGTAGCGTG CATTCTGGGC AATCCTGCAC  300

ATCAGGAGCG CTACCAGATC CGCTACCGGA TCCTTGGGGG AGGATCGGGG TGTCCGAGGG  360

CCCCGGGAGT CGGGATGGGC TTGGAAGGCT GGGGGGAGGG GCCTTTGAGG AAGAGGAGTC  420

CTGGAAGCGG GGGTCATCAC AGGTCAAGGG GTGGTCCTTG GGACCCCCGC AGTCAGTGGT  480

GCTGCGGCGG CAGAGTGCAC ATTGACAGCT GAGAGCCACG GCGTAGGAGA CCACGGGGTT  540

CACGCCGCGC GGGCAGCCAG GGAGCCGGAT GGACTCGAAG CGCACATCGC GGTAGTTGCA  600

CACCACCTGA GGCAGGGCCG GCAGGACCCC CTGCAGCACG CGGGTCATGG TGGGGCAGTA  660

GCCGGCACAG ATGGTGGTGT TGACGGTGAT GCACACGGGG CAGCCCTCCT TCTCCACAGC  720

CAGGGTGGCA TTGATGGGGC GGCACCGTGG CCGAAGCGGC TCCTTGGATG CCCATGTCCC  780

GCCCATGCTC AGCAGCAGCA ACAGCAGCAG CCCCTGGAAC ATCTCCATCC TTGG  834

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 743 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 33...734
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

ATGAAATCGA CGGAATCAGA CTCGAGCCAA GG ATG GAG ATG TTC CAG GGG CTG  53
                                    Met Glu Met Phe Gln Gly Leu
                                    1               5

CTG CTG TTG CTG CTG CTG AGC ATG GGC GGG ACA TGG GCA TCC AAG GAG  101
Leu Leu Leu Leu Leu Leu Ser Met Gly Gly Thr Trp Ala Ser Lys Glu
        10                  15                  20

CCG CTT CGG CCA CGG TGC CGC CCC ATC AAT GCC ACC CTG GCT GTG GAG  149
Pro Leu Arg Pro Arg Cys Arg Pro Ile Asn Ala Thr Leu Ala Val Glu



SEQUENCE VERIFICATION REPORT 

PATENT APPLICATION US/08/918,288 



DATE: 06/09/2000 
TIME: 05:01:57 



INPUT SET: S35603.raw 



Original Text 



